Traditional research methods of recording infant verbal behavior, namely, 
descriptions by a single observer transcribing the utterances of a single infant in a 
naturalistic setting, have been inadequate to provide the data necessary for modern 
linguistic analyses. The Center for Research on Language and Language Behavior has 
undertaken to correct this inadequacy. The Center collected permanent, complete, and 
continuous records of all vocalizations of two infants during their first five months of 
life. These data were then processed by new electro-acoustic techniques. In processing 
one hundred and eight 95-second vocal behavior samples, the computer determined 
(1) the number of utterances, (2) the duration of each utterance, and (3) the mean 
and standard deviation of the fundamental frequency and amplitude of each 
utterance. Further statistical analyses are now in progress. (WD) 

The vocal behavior of an infant during the first few months of life is the matrix for 
later language development. Therefore, the differentiation and organization of infant 
vocalizing as a function of maturational and environmental factors hold considerable interest. 

Previous research in this area consists almost entirely of descriptions by a single 
observer transcribing the utterances of a single infant in naturalistic settings. Two 
abridged biographical reports will serve as samples: one from a "pre-objective" period, that 
of the late 19th century, and one from recent decades, employing "more refined techniques" 
(McCarthy, 1946). The earlier writer, a keen observer of behavior, is Charles Darwin. The 
contemporary writer, a linguist, is W. F. Leopold. 

"The noise of crying is uttered in an instinctive manner... After a time the sound 
differs as to the cause, such as hunger and pain... he soon appeared to cry voluntarily... 
When 46 days old he first made little noises without any meaning to please himself, and 
these soon became varied... At exactly the age of a year, he made the great step of inventing 
a word for food, namely mum... and now, instead of beginning to cry when he was hungry, he 
used this word in a demonstrative manner... implying 'Give me food'" (Darwin, 1877, p. 285). 

"During the first few weeks the only sounds produced were cries of dissatisfaction... 
in the seventh and eighth weeks the sounds ceased to be purely incidental. She uttered more 
arbitrary sounds of satisfaction... cooing as an articulated expression of feelings of 
satisfaction was therefore well established by the end of the second month... By the seventh 
month there was a good deal of babbling prevalently ranging from [a] to [e], long, without 
many tongue movements... At the end of the eleventh month, her active vocabulary consisted of 
two words" (Leopold, 1939, p. 72). 

Linguistic studies in this field have focused on providing a description of the language 
of a particular child at different levels of development. The units of analysis have been 
phonemes, morphemes, words and sentences. The procedures for data collection usually begin 
with phonetic transcription of the infant's vocal behavior by a trained observer. The most 
extensive studies of this type are those of the linguists Grégoire, Leopold, Cohen, and 
Velten. The work of Irwin and Chen in the 1940's is notable for the refinements they 
introduced. These investigators began the practice of using two observers rather than one; this 
allowed them to obtain measures of observer agreement. They also increased the numbers of 
subjects observed and selected a more adequate sample of subjects. 

Psychologists working in this area have tended to study more molar aspects of language 
development. The main interest has been to provide normative data on various indices of 
development, e.g., use of form class, size of vocabulary, mean sentence length, and the like. 
The procedure for data collection is again transcription (usually alphabetic) by a trained 
observer using cross-sectional and longitudinal sampling. Among the classic studies of this 
type are those of Davis, Fisher, McCarthy, Shirley, Templin, and Lewis. 

The most striking feature of the research literature on infant vocalizing is the lack 
of advancement in the field. The basic drawback is the reliance on transcriptions obtained 
by observations in naturalistic settings. Consider the difficulty of transcribing infant 
speech sounds. Because the infant is in the process of learning to articulate, the sounds 
he utters are unlikely to fit neatly into any classificatory system. There is the danger 
that the trained observer filters the variegated vocal behavior through his own classificatory 
categories (categories developed with adult vocal behavior) and thus rejects much of 
importance. Moreover, the infant utters sounds rapidly and sporadically, making it difficult 
for the observer to keep an accurate or complete record. 

The obvious sampling problems have also limited the generality of the findings of 
previous studies. It has been recognized that when investigator and subject are also 
mother and child, experimental rigor receives little nourishment. In those less numerous 
studies in which the investigator has intruded into the home, the schedule of observation 
and transcription has usually been, to be most charitable, unsystematic. In summing up the 
results reported up to 1941, Irwin says: 

"It will be apparent from this review of the more important studies... that there does 
not exist a large body of data secured from adequate samplings of infants for purposes of 
a statistical analysis... Usually no systematic research methods were formulated; statistical 
techniques essential to the analysis of mass data are practically absent, no reliabilities 
of observers have been established, many observers used alphabetical rather than phonetic 
systems of symbols for recording; and most reports indulge in an inordinate amount of 
interpretation supported by very little empirical material" (Irwin, 1941, p. 285). 

In her classic review of the literature on language development in the child, McCarthy 
writes: "Although this wealth of observational material has proven stimulating and 
suggestive for later research workers, it has little scientific merit, for each of the studies 
has employed a different method, the observations have for the most part been conducted on 
single children who were usually either precocious or markedly retarded in their language 
development; the records have been made under varying conditions, and most of the studies 
are subject to the unreliability of parents' reports" (McCarthy, 1946, p. 478). 

Attempts to ameliorate one or more of the methodological problems that we have reviewed 
have waited upon advancements in instrumentation. In Part II of her 1929 review McCarthy 
described some of the early attempts to record speech by means other than transcription by 
an observer. The first device she described was the manometric flame, invented by Koenig 
in 1862. With this device, the flame of a gas jet is disturbed by the sound wave and these 
disturbances are recorded photographically. The phonautograph, invented by Scott in 1859, 
recorded the speech wave by means of a stylus attached to a diaphragm. The best device of 
this general type was the phonodeik, which consisted of a horn and a diaphragm; a small 
platinum wire attached at one end to the diaphragm was passed around a jewel-mounted spindle 
to a delicate spring. Attached to the spindle was a tiny mirror which reflected a fine 
beam of light onto a moving photographic film. The phonodeik could respond up to 10,000 cps 
but the horn and diaphragm introduced a certain amount of distortion. 

In 1893 Blondel had devised the oscillograph and with improvements in amplifiers it 
promised to be a very useful piece of apparatus. However, even before the oscillograph had 
been developed to the point where it gave a practically perfect representation of the sound 
wave, it became clear that, once recorded in all its complexity, the waveform was virtually 
impossible to analyze in ways that were useful for studying speech. 




Sound recording devices, the phonograph and later the wire and tape recorders, even when 
developed further than they were in 1929, solved only part of the problem. The actual speech 
sounds could then be recorded and replayed, but an observer still had to rely on his judgment 
as a perceiver of sound and speech in order to analyze the data. 

Two years after the introduction of the sound spectrograph, in 1949, Lynip published a 
report of the use of this device in the study of infant speech. He recorded the speech 
sounds of a little girl, beginning with the birth cry and sampling at intervals ending at 
56 weeks when intelligible speech began. A spectrographic analysis of these recordings 
revealed nothing which resembled the spectrograms of phonemes produced by adults. 

In 1960, Winitz published a data-garnished polemic on the subject of the spectrographic 
analysis of infant vocalization. He argued that the fact that the infant's spectrograms 
didn't look like spectrograms of adult speech simply proved that the spectrograph isn't a 
good device for the study of infant vocalizations. "The basic data against which any 
instrumental method of phonetic analysis must be validated are the phonetic analyses made 
by competent observers whose validity Lynip questions" (Winitz, 1960, p. 173). 

Without resolving the question of the validity of spectrographic description, it must 
be acknowledged that each two-second sample of vocalization requires approximately fifteen 
minutes to process, calibrate, crudely quantify, and classify as a spectrogram, and 
that the problems of observer interpretation remain, although transferred from the auditory 
to the visual mode. 

This brief account of the techniques that have been employed so far for collection and 
acoustic analysis of infant vocal behavior indicates that an extension of our knowledge of 
vocal development requires new techniques. Accordingly, the Center for Research on Language 
and Language Behavior undertook a year ago to collect permanent, complete and continuous 
records of all vocalizations of two infants, and then to process these records by novel 
electro-acoustic techniques. 

Recording and sampling of vocalizing. During the deliveries of the infants whose 
vocalizing is the subject of this study, medical personnel wore lapel microphones whose 
outputs were recorded on magnetic tape. All subsequent vocalizing by the infants at the 
hospital was recorded by placing them in private rooms containing a microphone wired to a 
fast-acting voice-operated switch (Miratel) and to a tape recorder (Tandberg). After leaving 
the hospital, both children were cared for at home in plexiglass "air cribs" (T.M.I.) that 
contained no sources of sound within the crib and attenuated external sounds, and hence 
provided a good recording environment. The parents of the children (research assistants at 
the Center in both cases) were paid to keep a detailed record on prepared forms of major 
environmental events affecting the infant. These records were synchronized with the tape 
recordings by writing down the reading of the footage indicator on the tape recorder. 

Complete recordings of all vocal behavior during the first five months of life 
constitute a formidable tape library, which was sampled for analysis in the following way. A 
master tape was prepared for each child which contained three 95-sec samples of the vocal 
behavior during every fourth day of life for the first 141 days. For the samples taken from 
the first month (in which the infant had no regular sleeping times) the three daily samples 
were excerpted from the recordings for midnight to 8 a.m., 8 a.m. to 4 p.m., and 4 p.m. to 
midnight, respectively. This was accomplished by listening to the recording, beginning at 
the start of each period, and copying the first 95 sec onto the master tape. In some cases 
undesired noises intruded, and the first 95 sec excluding these intrusions was copied. For 
the samples of vocalizing in the following three months, the three daily periods from which 
95-sec samples were taken were: time at awakening (T) to T + 4 hours; T + 4 hours to 
T + 8 hours; and T + 8 hours to T + 12 hours. These sampling procedures yielded 108 95-sec 
samples for the initial acoustic analysis. 
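
The count of 108 samples can be checked with a short sketch. It assumes that the sampled 
days begin with day 1 of life; the text states only "every fourth day" for the first 141 
days, so the starting day and the function name are illustrative assumptions. 

```python
# Sketch of the sampling schedule: three 95-sec samples on every fourth day
# of life over the first 141 days.
# ASSUMPTION: sampling starts on day 1 (the source does not state the first day).

def sampling_days(first_day=1, last_day=141, step=4):
    """Return the days of life on which samples were taken (hypothetical helper)."""
    return list(range(first_day, last_day + 1, step))

SAMPLES_PER_DAY = 3       # three daily periods, as described in the text
SAMPLE_LENGTH_SEC = 95    # length of each excerpt

days = sampling_days()
total_samples = len(days) * SAMPLES_PER_DAY
total_seconds = total_samples * SAMPLE_LENGTH_SEC

print(len(days), total_samples, total_seconds)
```

Under this assumption the schedule gives 36 sampled days and 108 samples per child, in 
agreement with the figure reported above. 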

Analysis of the prosodic features. The development of the prosodic features of the 
infant's vocal behavior was analyzed by extracting three acoustic parameters of the 
vocalizing during each of the 108 samples, using analog electronic devices. The outputs of these 
parameter extractors were sampled every 25 msec by an analog-to-digital converter, then 
processed by an on-line digital computer (PDP-4, Digital Equipment Corp.). 

The changing fundamental frequency of the vocalizing was extracted by filtering the 
tape-recorded signals into two frequency ranges. Since the harmonics of the fundamental frequency 
often have more energy than the fundamental itself, a range-control voltage is generated 
when there is energy in the lower range; this voltage turns off the upper range to exclude the 
harmonics. If no energy exists in the lower range, however, the fundamental frequency in 
the upper range is processed unimpeded. In either case, the nearly sinusoidal output from 
the filters is amplified in a mixer and read on a frequency meter which provides a DC 
voltage output proportional to the frequency of the fundamental sine wave at its input. A 
DC amplifier then adjusts the voltage range and polarity for input to the computer. 
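
The range-control logic can be illustrated in software. The sketch below is not the analog 
circuit used in the study; the function name, the energy inputs and the threshold are 
hypothetical, and it shows only the decision rule described above. 

```python
# Illustrative decision rule for the two-range fundamental-frequency extractor.
# ASSUMPTION: band energies are already measured and compared with a single
# threshold; the original apparatus did this with analog filters and a
# range-control voltage, whose actual settings are not given in the text.

def select_range(lower_energy, upper_energy, threshold):
    """Return which filter range feeds the frequency meter, or None for silence."""
    if lower_energy > threshold:
        # Energy in the lower range: the fundamental lies there, so the upper
        # range is switched off to exclude the harmonics.
        return "lower"
    if upper_energy > threshold:
        # No energy in the lower range: the fundamental itself lies in the
        # upper range and is passed unimpeded.
        return "upper"
    return None

print(select_range(lower_energy=0.8, upper_energy=2.5, threshold=0.1))
print(select_range(lower_energy=0.0, upper_energy=2.5, threshold=0.1))
```

The first call selects the lower range even though the upper range carries more energy, 
which is the case the range-control voltage was designed to handle. 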

The changing amplitude envelope of the vocalizing was extracted by applying the 
recorded signals to a full-wave rectifier followed by a low-pass filter. The output of 
this device is a DC voltage that is proportional to the absolute value of the amplitude of 
the vocal waveform (integrated over approximately one period of the fundamental). 
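
A digital analogue of this extractor can be sketched as follows. The original device was an 
analog circuit; the one-pole filter and its `smoothing` parameter are assumptions chosen for 
brevity, not the filter actually used. 

```python
# Sketch of envelope extraction: full-wave rectification followed by a
# low-pass filter.
# ASSUMPTION: a one-pole (exponential) low-pass filter stands in for the
# analog filter; `smoothing` is a hypothetical parameter in (0, 1].

def amplitude_envelope(signal, smoothing=0.1):
    """Return the smoothed absolute value of `signal`, sample by sample."""
    envelope = []
    state = 0.0
    for x in signal:
        rectified = abs(x)                        # full-wave rectifier
        state += smoothing * (rectified - state)  # one-pole low-pass filter
        envelope.append(state)
    return envelope

# A square wave of unit amplitude: the envelope rises toward 1.0.
square_wave = [1.0, -1.0] * 50
envelope = amplitude_envelope(square_wave, smoothing=0.1)
print(round(envelope[0], 3), round(envelope[-1], 3))
```

A smaller `smoothing` value corresponds to a longer integration time, at the cost of a 
slower response to the onset and cessation of vocalizing. 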

The duration of each utterance within a sample, the third prosodic parameter, was 
determined in the computer by processing the input from the amplitude extractor. When the 
amplitude dropped below a threshold value and remained there longer than the silence threshold 
(t0) in four out of five samples, the end of an utterance was logged at the time of the 
initial drop. The start of a new utterance was recorded when the amplitude exceeded threshold 
again. 
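
The segmentation rule can be sketched in simplified form. The sketch below requires the 
amplitude to stay below threshold for more than t0 consecutive samples and omits the 
four-out-of-five criterion, whose exact form the text does not specify; the function and 
parameter names are hypothetical. 

```python
# Simplified sketch of utterance segmentation from the amplitude envelope.
# ASSUMPTIONS: `amplitude` holds one value per 25-msec sample; silence must
# last more than `t0_samples` consecutive samples (the four-out-of-five rule
# mentioned in the text is not reproduced here).

def segment_utterances(amplitude, a0, t0_samples):
    """Return (start, end) sample indices; `end` is the index of the initial drop."""
    utterances = []
    start = None   # index at which the current utterance began
    drop = None    # index of the initial drop below threshold
    for i, a in enumerate(amplitude):
        if a > a0:
            if start is None:
                start = i          # amplitude exceeded threshold: new utterance
            drop = None            # any pause so far was too short to count
        elif start is not None:
            if drop is None:
                drop = i           # remember when the amplitude first dropped
            if i - drop + 1 > t0_samples:
                utterances.append((start, drop))  # end logged at initial drop
                start, drop = None, None
    if start is not None:          # recording ended during an utterance
        utterances.append((start, drop if drop is not None else len(amplitude)))
    return utterances

samples = [0, 5, 5, 0, 5, 5, 0, 0, 0, 0, 5, 0]
found = segment_utterances(samples, a0=1, t0_samples=2)
durations_msec = [(end - start) * 25 for start, end in found]
print(found, durations_msec)
```

In this example the one-sample pause at index 3 is shorter than the silence threshold and 
does not split the first utterance, whereas the longer pause beginning at index 6 does. 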

In addition to defining the beginning and end of utterances, the computer performed the 
following preliminary processing. Whenever the amplitude fell below a minimal threshold 
value or the frequency fell below a minimal value in a 25 msec sample, the values 
of a and f were set to zero. This eliminated spuriously low readings due to noise as well 
as vocal sounds without voicing at the glottis and hence without prosodic value. It also 
eliminated false frequency readings that would result from the rise-decay time of the 
frequency meter in response to instantaneous onset or cessation of voicing. 
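
This preliminary gating step can be sketched as follows. The names `a0` and `f0` for the two 
minimal values, and the numbers in the example, are illustrative assumptions. 

```python
# Sketch of the preliminary processing: a 25-msec sample whose amplitude or
# frequency falls below its minimal value has both readings set to zero.
# ASSUMPTION: `a0` and `f0` name the amplitude and frequency minima; the
# example values are invented for illustration only.

def gate_samples(amplitudes, frequencies, a0, f0):
    """Return copies of both series with sub-threshold samples zeroed."""
    gated_a, gated_f = [], []
    for a, f in zip(amplitudes, frequencies):
        if a < a0 or f < f0:
            a, f = 0.0, 0.0   # discard noise and unvoiced sounds
        gated_a.append(a)
        gated_f.append(f)
    return gated_a, gated_f

amps = [0.2, 3.0, 2.5, 2.8]
freqs = [440.0, 450.0, 60.0, 430.0]
print(gate_samples(amps, freqs, a0=1.0, f0=100.0))
```

Zeroing both values together keeps the amplitude and frequency series aligned, so that a 
sample rejected on either criterion contributes to neither statistic. 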

After sampling and then correcting the amplitude and frequency inputs in this fashion, 
the digital values were reconverted to voltages and plotted as a function of time on a 
strip-chart recorder. These records of the amplitude and frequency contours after preliminary 
processing were compared with those obtained directly from the parameter extractors (before 
computer processing) so as to choose values of a0, f0, and t0 that did not distort the 
original records. 

After the preliminary processing, the computer determined, for each 95-sec sample, the 
number of utterances as defined above, the duration of each utterance, and the mean and 
standard deviation of the fundamental frequency and amplitude of each utterance. Pooling 
these statistics for each of the utterances in a sample, the computer determined next their 
frequency distributions over the entire sample. These frequency distributions were found 
to be highly right-skewed. A logarithmic transformation was then applied in order to 
eliminate the skewness and thus to normalize the distributions. Hence, all statistics were 
computed using the logarithms of the frequency, amplitude or duration values. The computer 
determined next the means and standard deviations associated with these transformed 
composite distributions. Consequently, there were two kinds of statistics reported for each 
95-sec sample: (1) within-utterance measures of central tendency and variability, averaged 
over utterances, and (2) between-utterance measures of central tendency and variability. 

All in all, these composite statistics for each sample were printed out along with their 
frequency distributions: 

(Let M = mean, S = standard deviation, f = fundamental frequency, a = amplitude, 
d = duration.) 

M(Mf) - overall fundamental frequency 
S(Mf) - variability between utterances in fundamental frequency 
M(Sf) - overall variability within utterances in fundamental frequency 
S(Sf) - variability between utterances in the variability within utterances in 
fundamental frequency 
M(Ma), S(Ma), M(Sa), S(Sa) - as above but for the amplitude parameter 
Md - mean utterance duration 
Sd - variability in utterance duration 
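
The four composite statistics for one parameter can be sketched as follows. The base of the 
logarithm and the choice of the sample (n - 1) standard deviation are assumptions, since the 
text specifies neither; the function name and the example readings are invented for 
illustration. 

```python
# Sketch of the composite statistics for one parameter (e.g. fundamental
# frequency) in one 95-sec sample.
# ASSUMPTIONS: base-10 logarithms and the sample (n - 1) standard deviation;
# every utterance contains at least two non-zero 25-msec readings.
import math
from statistics import mean, stdev

def composite_stats(utterances):
    """`utterances` is a list of lists of positive readings, one list per utterance."""
    logs = [[math.log10(v) for v in u] for u in utterances]
    within_means = [mean(u) for u in logs]   # M within each utterance
    within_sds = [stdev(u) for u in logs]    # S within each utterance
    return {
        "M(M)": mean(within_means),   # overall level
        "S(M)": stdev(within_means),  # variability between utterances
        "M(S)": mean(within_sds),     # overall variability within utterances
        "S(S)": stdev(within_sds),    # variability between utterances in within-utterance variability
    }

# Hypothetical readings in cps for three utterances.
example = [[420.0, 450.0, 470.0], [380.0, 400.0], [440.0, 460.0, 455.0, 430.0]]
for name, value in composite_stats(example).items():
    print(name, round(value, 4))
```

Applied to the frequency readings, the four keys correspond to M(Mf), S(Mf), M(Sf) and S(Sf); 
the same function applied to amplitude readings gives the amplitude statistics. 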

These statistics, describing the three prosodic features of the vocalizing in each 
sample, are then plotted separately as a function of age at time of the sample, with the 
time of day (in three intervals) as a parameter. In this way, developmental trends may be 
discerned in the prosodic features of the infant's vocal behavior. 



Results. In Fig. 1 the average fundamental frequency of utterances, M(Mf), 
and the coefficient of variation between utterances in fundamental frequency, 
CVF [= S(Mf)/M(Mf)], are presented as a function of sample number. An 
examination of the developmental changes over the first 108 samples (141 days) shows 
that the average fundamental frequency M(Mf) at birth was approximately 450 cps, 
that it decreased to 370 cps by sample number 33 (approximately 45 days), and 
that it then rose and stabilized at about 450 cps for the duration of the 
study. The coefficient of variation between utterances remained small and 
constant at between .01 and .03 over the entire study. 

Insert Fig. 1 about here 

In Fig. 2 the average duration of utterances in msec and the coefficient 
of variation of duration (CVD = Sd/Md) are presented as a function of sample 
number. The average duration ranges from 100 msec to 800 msec over the 108 
samples. No developmental trend is apparent although the variability from 
sample to sample does decrease with age. The coefficient of variation of 
duration remains constant at about .20 over the entire sample. 

Insert Fig. 2 about here 
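
The coefficient of variation used in both figures is the ratio of a standard deviation to 
the corresponding mean. A minimal sketch follows; the input values are invented for 
illustration and are not data from the study. 

```python
# Sketch of the coefficient of variation, CV = S / M.
# ASSUMPTION: the example values are hypothetical, not results from the study.

def coefficient_of_variation(sd, mean_value):
    """Return sd / mean_value; a zero mean leaves the ratio undefined."""
    if mean_value == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return sd / mean_value

print(coefficient_of_variation(sd=80.0, mean_value=400.0))
```

Because the ratio is dimensionless, the variability of frequency and of duration can be 
compared on a common scale. 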

Further examination of developmental trends awaits the completion of 
further statistical computations now in progress with the present data. 



Footnote 



This paper was presented at the First Symposium of the Development of 
Language Functions, sponsored by the Center for Human Growth and Development, 
University of Michigan, on October 20-22, 1965. 